Nonlinear latent representations of high-dimensional task-fMRI data: Unveiling cognitive and behavioral insights in heterogeneous spatial maps

Finding an interpretable and compact representation of complex neuroimaging data is extremely useful for understanding brain behavioral mapping and hence for explaining the biological underpinnings of mental disorders. However, hand-crafted representations, as well as linear transformations, may inadequately capture the considerable variability across individuals. Here, we implemented a data-driven approach using a three-dimensional autoencoder on two large-scale datasets. This approach provides a latent representation of high-dimensional task-fMRI data which can account for demographic characteristics whilst also being readily interpretable both in the latent space learned by the autoencoder and in the original voxel space. This was achieved by addressing a joint optimization problem that simultaneously reconstructs the data and predicts clinical or demographic variables. We then applied normative modeling to the latent variables to define summary statistics (‘latent indices’) and establish a multivariate mapping to non-imaging measures. Our model, trained with multi-task fMRI data from the Human Connectome Project (HCP) and UK biobank task-fMRI data, demonstrated high performance in age and sex predictions and successfully captured complex behavioral characteristics while preserving individual variability through a latent representation. Our model also performed competitively with respect to various baseline models including several variants of principal components analysis, independent components analysis and classical regions of interest, both in terms of reconstruction accuracy and strength of association with behavioral variables.


S1 Fig: Age distribution in a) HCP and b) UK biobank
The list of UK-bio bank behavioral measures categories [2] is shown S1 Table.

1-4 UMAP hyper-parameters
We used the default parameters of UMAP for training.We tuned the number of neighbors (n_neighbor=15) and minimum distance of the points (min_dist=.1) in a cluster to maintain the balance between the global and local structure of data.Moreover, we selected Euclidean distance for computing the ambient space of data.These parameters provided robust results and better visual separation.

1-5 Normative modeling
Briefly, normative modeling presents a probabilistic interpretation of the deviations of the latent variables (UMAP projections) across all subjects.To measure the deviations, we used the predicted UMAP projections of the normative model for each individual participant, and next converted each to a subjectspecific Z score as described previously [3].
We applied Hierarchical Bayesian Regression (HBR) to perform normative modeling.Here, we modeled the posterior probability of the brain data, Y, given X, i.e.P(Y|X) using MCMC sampling.For the data likelihood, we used a sinh-arcsinh (SHASH) distribution (Jones and Pewsey, 2009[4]), which is derived from passing a Gaussian distributed random variable, x, through the transformation: sinh((sinh −1 () + )/)), where the parameters  and  govern shape of the underlying distribution.We and others have shown that this distribution is well-suited to modelling non-Gaussian data as it can model both platykurtic and leptokurtic distributions and positive and negative skew in the same distributionally family [4][5][6].We used a specific reparameterization of this distribution, referred to as SHASHb, which breaks the dependency between shape parameters, such that  and  relate to the skew and kurtosis respectively (see de Boer et al 2022 for details).To model non-linearity in the mean, variance and skew, we used a cubic b-spline basis expansions with.5 equally spaced knot points.To further increase the flexibility of the model, we added a random effect of sex to the mean, variance, skew parameters.
So briefly, our model is given by: 1.Likelihood probability with SHASHb distribution:

1-6 Comparison with mixed-PCA
In addition to standard PCA, we perform an additional analysis that aimed to also include age and sex in the model to mimic a linear variant of the semi-supervised approach.This helps to understand the contributions of the nonlinear nature of the autoencoder and the inclusion of age and sex.To do so, we conducted a simple 2 stage PCA that accommodate for age and sex in the second stage of PCA.First, we applied PCA to the data, resulting in N components.Then, we added age and sex as separate components to the N PCA components and applied another PCA to reduce the dimensionality to M components (M<N).
This process allowed us to mix the PCA components linearly with age and sex information.

1-7 Comparison with ICA
We applied sklearn's FastICA to vectorized data with 100 components.Following this, we applied UMAP to the ICA components, and then used the transformed UMAP components to calculate the univariate correlation with non-imaging-derived phenotypes (nIDPs).

1-8 Comparison with (Region of interests) ROIs
The underlying assumption here is that ROIs are hand-crafted summaries of the data.Similar to the approach with ICA components, we applied UMAP and measured the univariate correlations.

2-1 Model selection: Hyper-parameters tuning
S2 All the models were trained using NVIDIA P100 GPU, TensorFlow r2.8.The training time for HCP data with N=32K scans was approximately 11.5 minutes per epoch.The fine-tuning time for UKbiobank with N=15K was 2 minutes per epoch.The batch size was 10 for all the models.1-3 Semi-Supervised AE The architecture of 3D semi-supervised AE is shown in S2 Fig.We used 2580 scans from HCP (N=30 participants) for selecting the architecture and hyperparameters of the network.These scans never used again in the training and test phases.

2 .
Regression model with B-spline basis expansion: μi = f(Xi) = ∑ Bj(Xi) * βj where: μi = mean for observation i f(Xi) = B-spline basis expansion of the regression Bj(Xi) = B-spline basis function j evaluated at Xi βj = regression coefficient for B-spline basis function j 3.Random-effects model for mean, variance, and skew parameters: μi = f(Xi) + ui σ²i = g(Xi) + vi γi = h(Xi) + wi where: ui ,vi ,wi = random effects for mean, variance, and skew parameters, respectively f(Xi), g(Xi), h(Xi) are B-spline basis expansions over age for mean, variance, and skew parameters, respectively Inference was performed using Markov chain Monte Carlo inference, using the no U-turn (NUTS) sampler implemented in the pymc3 python package.Note that these equations provide only a brief overview of the HBR model described above.The full details of this approach can be found in [de Boer 2022][5] and the HBR implementation is provided in the pcntoolkit package (https://github.com/amarquand/PCNtoolkit).

The impact of supervised loss coefficient 𝝀 on latent
Table shows the performance of the models with different configurations and image normalization strategies in terms of normalized root mean square of reconstruction error (NRMSE), which is the root mean squared error scaled by the range of the input image.To selection, we trained the model with different values of .S3 Table indicates the impact of  on unsupervised and supervised loss in the small held-out test data set of HCP.As expected,  = 1 fails to predict age and sex.By increasing the supervised loss impact and decreasing  , however, the reconstruction loss increases.= 0.05 shows the better performance in terms of sex and age prediction and reconstruction error, so we use this value for all the main analyses reported in the main text.S3 Table: model performance in terms of loss values for different values of λ